Extensible Markup Language, or XML, is poised to become the standard markup 
language used to construct Web pages on the World Wide Web. Extensible Markup 
Language incorporates components of both Standard Generalized Markup Language 
(SGML) and HyperText Markup Language (HTML), resulting in a flexible language that 
is user-friendly and supports many different applications. 

First, it is essential to understand how a computer "reads" a Web page. HyperText 
Markup Language displays pages on the World Wide Web by tagging different elements 
in a document (Webopedia, 1999). Because HTML is a very basic language, the 
demand for formatting data rather than just displaying it has surpassed HTML's 
capabilities. Therefore, XML has been introduced as a possible solution to the 
increased demand for formatted information on Web pages. XML provides a standard 
for Web authors that can be read by different browsers and different computer 
platforms. Extensible Markup Language seeks to do away with vendor-specific markup 
language (compatible with only Internet Explorer or Netscape Navigator, for example). 
Extensible Markup Language will make the Web a more efficient educational tool 
because it will allow for more accurate searching. The data in XML Web pages will be 
structured and not just displayed. 

WHAT IS A MARKUP LANGUAGE? 



Simply stated, a Web page must be written in a markup language for a computer Web 
browser to interpret how to display that page. Standard Generalized Markup Language 
(SGML) is a complex language that allows a programmer to format documents. 
HyperText Markup Language is a language described in SGML, and widely regarded as 
the standard for Web publishing. HyperText Markup Language is quite austere 
compared to SGML, and therefore limited. HyperText Markup Language uses tags to 
describe how data will be presented on a Web page. For instance, the <B> tag 
is used to make text appear in boldface (Bosak, 1999). Of course, the Web is 
a dynamic environment, and new demands are made of HTML all the time. As more 
elements are added to HTML, problems arise with browser compatibility. Something 
that works well in Netscape Navigator might fail miserably in Internet Explorer. 

Also, HTML can make an attractive Web page fairly easily, but it doesn't tell the 
computer a thing about content. With Web sites proliferating at an astounding rate, the 
need presents itself for a markup language that is both multi-browser compatible and 
capable of formatting data so that information on the World Wide Web is found more 
quickly and easily. Therefore, XML was developed. Because XML is not as pared down 
as HTML, it can use the complexity of SGML to make Web pages more active. The 
result will be a faster World Wide Web, with more reliable search results. 

HOW XML WORKS 



Extensible Markup Language allows a person to invent an array of tags to describe their 
text document (Bray, 1997). In HTML, there are a limited number of tags, such as 
<B> or <I>, and these tags format text and nothing more. In XML, a person could invent 
a set of tags to describe, for instance, a lesson plan. Such a set of tags might look 
something like this: 

<LESSON_PLAN> 
  <SUBJECT>English Literature</SUBJECT> 
  <TITLE>An Introduction to Shakespeare</TITLE> 
  <CONCEPTS> 
    <P>The main concepts covered in this lesson are the life of William Shakespeare (i.e., 
    his childhood, early acting career, life as a playwright, his personal life) and the 
    Elizabethan Era. </P> 
  </CONCEPTS> 
</LESSON_PLAN> 

If an English teacher wanted to mine the Web for lesson plans, XML would allow search 
engines to conduct a much more productive search based on the tags used, similar to 
those illustrated above. 

Suppose an educator was interested in developing a lesson plan on the life of William 
Shakespeare. Entering the words "William Shakespeare" in a typical search engine now 
could result in thousands and thousands of hits, with relatively few of educational value. 
With XML, search engines will search both the tags and the content of the page, thus 
bringing up "Lesson Plan" or "Literature," and winnowing the search results to the rich, 
relevant data needed. This type of tagging is referred to as metadata, or literally, "data 
about data." In the same way, it would be much easier to find information about the 
movie Shakespeare In Love, because the metatag for that site would be <MOVIE>, or 
something similarly descriptive. 
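
The tag-aware search described above can be sketched in a few lines of Python, using the standard library's xml.etree.ElementTree module. This is only an illustration under stated assumptions: the three pages, their file names, and the search function are invented for the example, and a real search engine would work quite differently. Because an XML element name cannot contain a space, the root tag is written here as LESSON_PLAN.

```python
import xml.etree.ElementTree as ET

# Hypothetical pages: file names, tags, and contents are invented for illustration.
PAGES = {
    "lesson.xml": """
<LESSON_PLAN>
  <SUBJECT>English Literature</SUBJECT>
  <TITLE>An Introduction to Shakespeare</TITLE>
  <CONCEPTS>
    <P>The life of William Shakespeare and the Elizabethan Era.</P>
  </CONCEPTS>
</LESSON_PLAN>""",
    "movie.xml": """
<MOVIE>
  <TITLE>Shakespeare In Love</TITLE>
</MOVIE>""",
    "essay.xml": """
<ESSAY>
  <TITLE>Notes on Elizabethan Theatre</TITLE>
</ESSAY>""",
}


def search(pages, root_tag, keyword):
    """Return the names of pages whose root element is root_tag
    and whose text content mentions keyword (case-insensitive)."""
    hits = []
    for name, source in pages.items():
        root = ET.fromstring(source.strip())
        # itertext() walks every element, so nested text is searched too.
        text = " ".join(root.itertext())
        if root.tag == root_tag and keyword.lower() in text.lower():
            hits.append(name)
    return sorted(hits)


if __name__ == "__main__":
    print(search(PAGES, "LESSON_PLAN", "Shakespeare"))
    print(search(PAGES, "MOVIE", "Shakespeare"))
```

A keyword search alone would match both the lesson plan and the movie page, since each mentions Shakespeare; checking the tag as well is what separates the two.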

The Gateway to Educational Materials (GEM) Project is an online ERIC resource for 
Internet-based lesson plans and curriculum units. GEM will be able to build a set of XML 
tags which specify exactly how the Web pages for these educational materials should 
be put together. As a result, a standard will be developed, not only in how the pages 
appear to the user, but in how the search engines interpret the data that they contain. 
HyperText Markup Language will give the GEM project a distinctive "look" via images, 
colors, and fonts. More importantly, XML will create a standard for how the GEM 
information is formatted, much as described previously with the lesson plans. The key 
concept here is the containment of data. Being able to find the data on a Web site in an 
organized fashion greatly increases the value of that Web site, and XML can do this. 

CUSTOMIZING XML FOR INDIVIDUAL NEEDS 



Now, chaos could easily erupt if everyone in charge of a Web site decided to arbitrarily 
design his or her own set of metatags as descriptors. However, the potential for specific 
groups of people, such as educators or those at the GEM project, to customize their 
own particular sets of elements is enormous. When a set of metatags is formally defined for a 
particular interest group, the definition is referred to as a Document Type Definition (DTD). By 
fashioning a DTD, a formal set of markup elements can be developed as a standard for 
professionals in a particular field. The DTD names the elements and defines what, 
where, and how they may be used (Flynn, 1999). The DTD will also tell the author what 
tags are acceptable, how the tags must be arranged within each other, and in what 
order they need to appear. The process is similar to preparing a composition paper. A 
teacher giving a writing assignment would expect that the paper's introduction would 
come first, then the body, followed by the conclusion. She would expect students to 
place clauses inside sentences, and sentences inside paragraphs. The students would 
be required to use this DTD, but they could fill in their own "data." If this were a DTD for 
a history class, then the content of the paper would have something to do with history. 
The DTD holds many implications for streamlining data that come from many resources 
but relate to a particular thing. The student information form, which college freshmen fill 
out as they enter college for the first time, could be completed on various 
Web sites that all use the same DTD. Because all the data would be housed exactly the same 
way, it could be much more easily mined for important information about this group of 
students. Instead of having to manipulate huge sets of raw data, researchers would find 
the data already organized in a predetermined way. 
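
To make the idea of a DTD more concrete, the following Python sketch embeds a small, hypothetical DTD for the lesson-plan example inside an XML document. The element names and content rules are assumptions made for illustration, not a DTD published by GEM or anyone else. Python's standard library parses the DTD but does not validate against it, so the follows_outline function below is a hand-written stand-in that checks only one rule: the required children of LESSON_PLAN and their order.

```python
import xml.etree.ElementTree as ET

# Hypothetical DTD for the lesson-plan example; all names are illustrative.
# The first declaration says a LESSON_PLAN holds SUBJECT, TITLE, and CONCEPTS,
# in exactly that order, much like introduction, body, and conclusion.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE LESSON_PLAN [
  <!ELEMENT LESSON_PLAN (SUBJECT, TITLE, CONCEPTS)>
  <!ELEMENT SUBJECT (#PCDATA)>
  <!ELEMENT TITLE (#PCDATA)>
  <!ELEMENT CONCEPTS (P+)>
  <!ELEMENT P (#PCDATA)>
]>
<LESSON_PLAN>
  <SUBJECT>English Literature</SUBJECT>
  <TITLE>An Introduction to Shakespeare</TITLE>
  <CONCEPTS>
    <P>The life of William Shakespeare and the Elizabethan Era.</P>
  </CONCEPTS>
</LESSON_PLAN>"""

# Hand-written mirror of the first declaration above.
REQUIRED_CHILDREN = ["SUBJECT", "TITLE", "CONCEPTS"]


def follows_outline(source):
    """Return True if the document's root is LESSON_PLAN and its direct
    children are exactly the required elements, in the required order.
    This is not DTD validation, only a simplified check of one rule."""
    root = ET.fromstring(source)
    if root.tag != "LESSON_PLAN":
        return False
    return [child.tag for child in root] == REQUIRED_CHILDREN


if __name__ == "__main__":
    print(follows_outline(DOCUMENT))
```

A validating XML parser would enforce every declaration in the DTD automatically; the manual check is used here only so that the example runs without additional software.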

MAKING IT WORK ON THE WORLD WIDE WEB 



Extensible Markup Language is still being adapted to the limitations of browsers. 
Originally, displaying XML in a browser required the use of Cascading Style Sheets (CSS). 
In essence, CSS allows Web authors to write their own style rules to determine how the 
content of a particular page will be displayed. The Web author can write a style rule, 
for instance: H2 { font: 24pt Helvetica; font-weight: bold; }. The rule is contained within 
the style sheet, which means that every time the author uses the HTML H2 in the body 
of the page, it will automatically be 24pt bold Helvetica. By using CSS, the Web author 
needs to define his or her expectations for H2 only once (within the style sheet) instead 
of every time it occurs in the body of the Web page. 

Unfortunately, CSS commonly fails with today's browsers. A style sheet that works for 
Navigator might not work in Explorer, and vice versa. A font, such as Helvetica or Arial, 
might be specific to only one of the browsers. This might seriously impact the 
appearance of the Web page to any users on another browser. Moreover, older 
versions of browsers will not be able to handle CSS, so it is important for Web authors 
to consider how many of their potential users will be on older browsers. 

SUMMARY 



For anyone who has ever dabbled in Web authoring, the reassuring news is that XML 
promises to be just as easy to learn as HTML. The biggest change is that the Web 
author must write or borrow a DTD before beginning. As XML becomes more pervasive, 
expect to find DTDs readily available in a variety of subject matters. Cascading Style 
Sheets are also relatively simple to learn and use, and pages use less bandwidth 
because specifics about certain tags are contained within the style sheet instead of 
throughout the body of the Web page. Because XML is still a relatively new 
development, browsers are not yet being marketed as XML-compatible. HTML and 
SGML documents will still be viewable while browsers begin to implement XML (Flynn, 
1999). Extensible Markup Language holds great promise for organizing data on the 
World Wide Web. Its capacity for formatting data will be a great leap forward for all 
those who are connected to the Internet, either as Web authors or Web users. 